Constant Partying: Growing and Handling Trees with Constant Fits

Authors

  • Torsten Hothorn
  • Achim Zeileis
Abstract

This vignette describes infrastructure for regression and classification trees with simple constant fits in each of the terminal nodes. Thus, all observations that are predicted to be in the same terminal node also receive the same prediction, e.g., a mean for numeric responses or proportions for categorical responses. This class of trees is very common and includes all traditional tree variants (AID, CHAID, CART, C4.5, FACT, QUEST) and also more recent approaches like CTree. Trees inferred by any of these algorithms could in principle be represented by objects of class “constparty” in partykit, which then provides unified methods for printing, plotting, and predicting. Here, we describe how one can create “constparty” objects by (a) coercion from other R classes, (b) parsing of XML descriptions of trees learned in other software systems, or (c) learning a tree using one’s own algorithm.
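
The notion of a constant fit can be illustrated with a minimal sketch. It is written in Python rather than R and does not use or mirror the partykit API; the names `constant_fits` and `stump_node`, the split threshold, and the toy data are all hypothetical. Each terminal node receives a single fitted value: a mean for a numeric response, or class proportions for a categorical one.

```python
from collections import Counter, defaultdict


def constant_fits(node_ids, responses):
    """Compute one constant fit per terminal node (hypothetical helper)."""
    groups = defaultdict(list)
    for node, y in zip(node_ids, responses):
        groups[node].append(y)

    fits = {}
    for node, ys in groups.items():
        if all(isinstance(y, (int, float)) for y in ys):
            # Numeric response: the constant fit is the node mean.
            fits[node] = sum(ys) / len(ys)
        else:
            # Categorical response: the constant fit is the class proportions.
            counts = Counter(ys)
            fits[node] = {label: n / len(ys) for label, n in counts.items()}
    return fits


def stump_node(x, threshold=3.0):
    """Assumed one-split tree: node 2 if x <= threshold, otherwise node 3."""
    return 2 if x <= threshold else 3


# Toy data, invented for illustration only.
x = [1.0, 2.0, 4.0, 5.0]
y_num = [10.0, 12.0, 20.0, 24.0]
y_cat = ["a", "a", "b", "a"]

nodes = [stump_node(v) for v in x]
num_fits = constant_fits(nodes, y_num)
cat_fits = constant_fits(nodes, y_cat)

# Every new observation falling into the same node gets the same prediction.
predictions = [num_fits[stump_node(v)] for v in [1.5, 4.5]]
```

Because prediction only requires finding the terminal node and looking up its stored value, the same lookup works regardless of which algorithm grew the tree, which is what makes a unified representation feasible.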

Similar resources

Variable selection bias in regression trees with constant fits

The greedy search approach to variable selection in regression trees with constant fits is considered. At each node, the method usually compares the maximally selected statistic associated with each variable and selects the variable with the largest value to form the split. This method is shown to have selection bias if predictor variables have different numbers of missing values and the bias ...


Parallel Generation of t-ary Trees

A parallel algorithm for generating t-ary tree sequences in reverse B-order is presented. The algorithm generates t-ary trees by 0-1 sequences, and each 0-1 sequence is generated in constant average time O(1). The algorithm is executed on a CREW SM SIMD model, and is adaptive and cost-optimal. Prior to the discussion of the parallel algorithm, a new sequential generation with O(1) average time ...


Reliability Measures Improvement and Sensitivity Analysis of a Coal Handling Unit for Thermal Power Plant

The present paper investigates the reliability and sensitivity analysis of a coal handling unit of a thermal power plant using a probabilistic approach. The coal handling unit is the main block of a thermal power plant, and for the plant to function well, its power supply, which is handled in the coal handling unit, must operate continuously without interruption. The configurati...


A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining

Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...


Gradient Boosting With Piece-Wise Linear Regression Trees

Gradient boosting using decision trees as base learners, so-called Gradient Boosted Decision Trees (GBDT), is a very successful ensemble learning algorithm widely used across a variety of applications. Recently, various GBDT construction algorithms and implementations have been designed and heavily optimized in some very popular open-source toolkits such as XGBoost and LightGBM. In this paper, ...


Publication date: 2015